A model for the synthesis of natural sounding vowels
Abstract
A model has been developed to preserve some of the naturalness that is usually lost in speech synthesis. A parametrized function is used to approximate the cross-sectional area through the glottis. A circuit model of the subglottal and glottal system is used, together with the supraglottal pressure, to generate the glottal volume velocity. The tract used to obtain the supraglottal pressure is represented by its input-impedance impulse response, which can be calculated from the area function of the tract. Convolving the input-impedance impulse response with the volume velocity gives the supraglottal pressure. The two coupled equations for the volume velocity are solved simultaneously. The output of the model is generated by convolving the resulting glottal volume velocity with the transfer-function impulse response of the tract. This technique preserves the interaction between the glottal flow and the vocal tract, which is usually lost. Comparisons are made between "complete tract loading" and "inductive tract loading." Magnitude spectra of the various pressures and the glottal volume velocity are examined in detail. Effects of varying the glottal parameters are examined for one vowel. Listening tests showed that vowels synthesized with the interaction were preferred as more natural sounding than those synthesized without it.
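The processing chain described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the raised-cosine area pulse, the linear glottal resistance `k_res / area`, the parameter values, and every function name are assumptions made for illustration, and units are arbitrary. The sketch only shows how a flow equation and a convolution-based loading equation can be solved together at each sample, and how the output is then obtained by a second convolution.

```python
import math

def glottal_area(n, period, open_frac=0.6, a_max=0.2):
    """Hypothetical parametrized glottal area: a raised-cosine pulse during
    the open phase and zero during the closed phase (an assumed shape)."""
    phase = (n % period) / period
    if phase >= open_frac:
        return 0.0
    return 0.5 * a_max * (1.0 - math.cos(2.0 * math.pi * phase / open_frac))

def convolve(x, h):
    """Plain discrete convolution of x with h, truncated to len(x)."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        y[n] = sum(h[k] * x[n - k] for k in range(min(n + 1, len(h))))
    return y

def synthesize(n_samples, period, z_in, h_tf, p_sub=8000.0, k_res=50.0):
    """Source-tract interaction sketch under two assumed equations:
         p_sub - p[n] = (k_res / area[n]) * u[n]    (linear glottal resistance)
         p[n] = sum_k z_in[k] * u[n - k]            (tract loading, convolution)
       Substituting the second into the first gives, per sample,
         u[n] = (p_sub - history) / (k_res / area[n] + z_in[0]).
    """
    u = [0.0] * n_samples  # glottal volume velocity
    p = [0.0] * n_samples  # supraglottal pressure
    for n in range(n_samples):
        # Contribution of past flow samples to the present pressure.
        history = sum(z_in[k] * u[n - k]
                      for k in range(1, min(n + 1, len(z_in))))
        area = glottal_area(n, period)
        if area > 0.0:
            u[n] = (p_sub - history) / (k_res / area + z_in[0])
        # Closed glottis: u[n] stays 0.
        p[n] = z_in[0] * u[n] + history
    # Output: flow convolved with the transfer-function impulse response.
    return u, p, convolve(u, h_tf)

# Toy impulse responses, chosen only to exercise the code.
u, p, out = synthesize(200, 100, z_in=[10.0, 5.0, 2.0], h_tf=[1.0, 0.5])
```

Setting `z_in=[0.0]` removes the loading term, so the flow simply follows the area pulse. That gives a non-interactive baseline to compare against, in the spirit of the with/without-interaction comparison the abstract mentions.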
Similar resources
Aspiration Noise during Phonation: Synthesis, Analysis, and Pitch-scale Modification
The current study investigates the synthesis and analysis of aspiration noise in synthesized and spoken vowels. Based on the linear source-filter model of speech production, we implement a vowel synthesizer in which the aspiration noise source is temporally modulated by the periodic source waveform. Modulations in the noise source waveform and their synchrony with the periodic source are shown ...
Acoustic Vowel Analysis in a Mexican Spanish HMM-based Speech Synthesis
The synthetic voice produced from an HMM-based system is often reported as sounding muffled when it is compared to natural speech. There are several reasons for this effect: some precise and fine characteristics of the natural speech are removed, minimized or hidden in the modeling phase of the HMM system; the resulting speech-parameter trajectories become oversmoothed versions of the speech wa...
Acoustic Analysis of Persian EFL Learners' Pronunciation of English Vowels
This paper reports the results of an experimental study on non-native production of English vowels. Two groups of Persian EFL learners varying in language proficiency were tested on their ability to produce the nine plain vowels of American English. Vowel production accuracy was assessed by means of acoustic measurements. Ladefoged and Maddison’s (1996) F1 F2 measurements for American English v...
Automatic synthesis of natural-sounding intonation for text-to-speech conversion in Dutch
A set of rules is proposed for the automatic synthesis of natural-sounding intonation as part of speech synthesis in Dutch from unrestricted text. Results of a formal perceptual evaluation show that the synthetic intonation is judged to be as natural as human intonation for isolated utterances; for texts, additional provisions are required to model contributions of text structure. It is suggest...
Duration Modeling by Multi-Models based on Vowel Production characteristics
An accurate estimation of segmental durations is needed for natural sounding text-to-speech (TTS) synthesis. This paper proposes multi-models based on production aspects of vowels. In this work four multi-models are developed based on vowel length, vowel height, vowel frontness and vowel roundness. In each multi-model, syllables are divided into groups based on specific vowel articulation characte...